Japanese Zero Pronoun Resolution based on Ranking Rules and Machine Learning

Authors

  • Hideki Isozaki
  • Tsutomu Hirao
Abstract

Anaphora resolution is one of the most important research topics in Natural Language Processing. In English, overt pronouns such as she and definite noun phrases such as the company are anaphors that refer to preceding entities (antecedents). In Japanese, anaphors are often omitted, and these omissions are called zero pronouns. There are two major approaches to zero pronoun resolution: the heuristic approach and the machine learning approach. Since we have to take various factors into consideration, it is difficult to find a good combination of heuristic rules. Therefore, the machine learning approach is attractive, but it requires a large amount of training data. In this paper, we propose a method that combines ranking rules and machine learning. The ranking rules are simple and effective, while machine learning can take more factors into account. From the results of our experiments, this combination gives better performance than either of the two previous approaches.

Similar resources

Utilizing Features of Verbs in Statistical Zero Pronoun Resolution for Japanese Speech

This paper proposes a statistical zero pronoun resolution method that utilizes features of verbs. In Japanese speech, the subject is often omitted, especially when it is the first person. To resolve such zero pronouns, features related to the verbs such as functional expressions play important roles. However, recent state-of-the-art zero-pronoun resolution systems lack these features because th...

Zero Pronoun Resolution can Improve the Quality of J-E Translation

In Japanese, particularly spoken Japanese, subjective, objective and possessive cases are very often omitted. Japanese-English statistical machine translation often renders such sentences as English sentences whose subjective, objective and possessive cases are also omitted, which decreases the quality of translation. We performed experiments of J-E phrase based tran...

Capturing Salience with a Trainable Cache Model for Zero-anaphora Resolution

This paper explores how to apply the notion of caching introduced by Walker (1996) to the task of zero-anaphora resolution. We propose a machine learning-based implementation of a cache model to reduce the computational cost of identifying an antecedent. Our empirical evaluation with Japanese newspaper articles shows that the number of candidate antecedents for each zero-pronoun can be dramatic...

A Deep Neural Network for Chinese Zero Pronoun Resolution

This paper investigates the problem of Chinese zero pronoun resolution. Most existing approaches are based on machine learning algorithms using hand-crafted features, which is labor-intensive. Moreover, semantic information that is essential in the resolution of noun phrases has not been addressed enough by previous approaches on zero pronoun resolution. This is because zero pronouns have...

Identification and Resolution of Chinese Zero Pronouns: A Machine Learning Approach

In this paper, we present a machine learning approach to the identification and resolution of Chinese anaphoric zero pronouns. We perform both identification and resolution automatically, with two sets of easily computable features. Experimental results show that our proposed learning approach achieves anaphoric zero pronoun resolution accuracy comparable to a previous state-of-the-art, heuristi...

Publication date: 2003